Dynamic Hashing: Adaptive Metadata Management for Petabyte-scale File Systems
Authors
Abstract
In a petabyte-scale file system, the performance and scalability of metadata access significantly affect the performance and scalability of the whole system. We present a new approach to metadata management called Dynamic Hashing (DH). DH introduces the RELAB (RElative LoAd Balancing) strategy to adjust the metadata distribution as the workload changes dynamically. An Elasticity strategy is proposed to support changes in the metadata server (MDS) cluster. A WLM (Whole Lifecycle Management) strategy is presented to find hot spots in the file system efficiently and to reclaim replicas for these hot spots when necessary. DH combines these strategies with Lazy Policies borrowed from Lazy Hybrid (LH) metadata management to form an adaptive, high-performance, and scalable metadata management technique.
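To make the strategies concrete, the following is a minimal Python sketch of the kind of machinery the abstract describes, not the paper's actual implementation; every name in it (MDSCluster, NUM_BUCKETS, HOT_THRESHOLD, IMBALANCE_RATIO) and the exact policies are illustrative assumptions. Metadata paths are hashed into fixed buckets mapped onto MDS nodes; an overloaded node sheds its busiest bucket to the least-loaded node (a RELAB-like relative adjustment), a frequently accessed bucket gains a read replica (a WLM-like hot-spot response), and new nodes can join before a rebalance (elasticity).

    # Illustrative sketch only: hash-bucketed metadata distribution over an
    # MDS cluster, with RELAB-like rebalancing and WLM-like hot-spot replicas.
    import hashlib
    from collections import defaultdict

    NUM_BUCKETS = 64        # the hash space is split into fixed buckets (assumed)
    HOT_THRESHOLD = 100     # accesses before a bucket counts as a hot spot (assumed)
    IMBALANCE_RATIO = 1.5   # rebalance when max load exceeds 1.5x the mean (assumed)

    class MDSCluster:
        def __init__(self, nodes):
            self.nodes = list(nodes)
            # spread the buckets round-robin over the initial MDS nodes
            self.bucket_owner = {b: self.nodes[b % len(self.nodes)]
                                 for b in range(NUM_BUCKETS)}
            self.access_count = defaultdict(int)   # per-bucket access counter
            self.replicas = defaultdict(set)       # extra read copies of hot buckets

        def bucket_of(self, path):
            digest = hashlib.md5(path.encode()).hexdigest()
            return int(digest, 16) % NUM_BUCKETS

        def lookup(self, path):
            b = self.bucket_of(path)
            self.access_count[b] += 1
            # WLM-like: give a hot bucket a read replica on the least-loaded node
            if self.access_count[b] > HOT_THRESHOLD and not self.replicas[b]:
                spare = min(self.nodes, key=self._load)
                if spare != self.bucket_owner[b]:
                    self.replicas[b].add(spare)
            return self.bucket_owner[b]

        def _load(self, node):
            return sum(c for b, c in self.access_count.items()
                       if self.bucket_owner[b] == node)

        def rebalance(self):
            # RELAB-like: move the busiest bucket off the most loaded node
            loads = {n: self._load(n) for n in self.nodes}
            mean = sum(loads.values()) / len(loads)
            busiest = max(loads, key=loads.get)
            if mean > 0 and loads[busiest] > IMBALANCE_RATIO * mean:
                victim = max((b for b, n in self.bucket_owner.items() if n == busiest),
                             key=lambda b: self.access_count[b])
                self.bucket_owner[victim] = min(loads, key=loads.get)

        def add_node(self, node):
            # elasticity: a new MDS joins; the next rebalance shifts load onto it
            self.nodes.append(node)

A caller would construct the cluster with MDSCluster(["mds0", "mds1", "mds2"]), route every metadata operation through lookup(path), and invoke rebalance() periodically or right after add_node().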
Similar Resources
Intelligent Metadata Management for a Petabyte-scale File System
In petabyte-scale distributed file systems that decouple read and write operations from metadata operations, the behavior of the metadata server cluster will be critical to overall system performance. We examine aspects of the workload that make it difficult to distribute effectively, and present a few potential strategies to demonstrate the issues involved. Finally, we describe the advantages of intelligent ...
CalvinFS: Consistent WAN Replication and Scalable Metadata Management for Distributed File Systems
Existing file systems, even the most scalable systems that store hundreds of petabytes (or more) of data across thousands of machines, store file metadata on a single server or via a shared-disk architecture in order to ensure consistency and validity of the metadata. This paper describes a completely different approach for the design of replicated, scalable file systems, which leverages a high...
Scalable Archival Data and Metadata Management in Object-based File Systems
Online archival capabilities like snapshots or checkpoints are fast becoming an essential component of robust storage systems. Emerging large distributed file systems are also shifting to object-based storage architectures that decouple metadata from file I/O operations. As the size of such systems scale to petabytes of storage, it is critically important that file system features continue to o...
Pergamum: energy-efficient archival storage with disk instead of tape
Dr. Ethan L. Miller is an associate professor of computer science at the University of California, Santa Cruz, where he is a member of the Storage Systems Research Center (SSRC). His current research projects, which are funded by the NSF, Department of Energy, and industry support for the SSRC, include long-term archival storage systems, scalable metadata and indexing, issues in petabyte-scale ...
Distributed Metadata Management Scheme in HDFS
The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably and to stream those data sets at high bandwidth to user applications. Metadata management is critical to a distributed file system. In the HDFS architecture, a single master server manages all metadata, while a number of data servers store the file data. This architecture cannot meet the exponentially increased stor...
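As a toy illustration of the alternative this excerpt points toward (hypothetical code, not HDFS's actual implementation; the server names are assumptions), one common way to spread metadata over several servers is to hash each file's parent directory, so that entries in the same directory stay on the same metadata server:

    # Hypothetical illustration, not HDFS code: partition the namespace by
    # hashing each file's parent directory across several metadata servers.
    import hashlib
    import posixpath

    METADATA_SERVERS = ["mds0", "mds1", "mds2", "mds3"]  # assumed server names

    def server_for(path):
        parent = posixpath.dirname(path) or "/"
        digest = hashlib.sha1(parent.encode()).hexdigest()
        return METADATA_SERVERS[int(digest, 16) % len(METADATA_SERVERS)]

    # files in the same directory land on the same metadata server
    print(server_for("/user/alice/data.csv"))
    print(server_for("/user/alice/log.txt"))   # same server as the line above

Hashing the parent directory rather than the full path keeps directory-local operations, such as listing a directory, on a single server, at the cost of coarser load distribution than per-file hashing.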